Observations on Bagging
Abstract: Bagging is a device intended for reducing the prediction error of learning algorithms. In its simplest form, bagging draws bootstrap samples from the training sample, applies the learning algorithm to each bootstrap sample, and then averages the resulting prediction rules. More generally, the resample size $M$ may be different from the original sample size $N$, and resampling can be done with or without replacement. We investigate bagging in a simplified situation: the prediction rule produced by a learning algorithm is replaced by a simple real-valued U-statistic of i.i.d. data. U-statistics of high order can describe complex dependencies, and yet they admit a rigorous asymptotic analysis. We show that bagging U-statistics often but not always decreases variance, whereas it always increases bias. The most striking finding, however, is an equivalence between bagging based on resampling with and without replacement: the respective resample sizes $M_{\text{with}} = \alpha_{\text{with}} N$ and $M_{\text{w/o}} = \alpha_{\text{w/o}} N$ produce very similar bagged statistics if $\alpha_{\text{with}} = \alpha_{\text{w/o}} / (1 - \alpha_{\text{w/o}})$. While our derivation is limited to U-statistics, the equivalence seems to be universal. We illustrate this point in simulations where bagging is applied to CART trees.
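The resampling scheme and the resample-size matching described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' simulation: the variance kernel h(x, y) = (x - y)^2 / 2, the sample size of 40, the number of resamples, and all function names are assumptions chosen for the example.

```python
import random
from itertools import combinations


def alpha_with_from_without(alpha_wo):
    """Map a without-replacement fraction to the matching with-replacement one."""
    if not 0.0 < alpha_wo < 1.0:
        raise ValueError("alpha_wo must lie strictly between 0 and 1")
    return alpha_wo / (1.0 - alpha_wo)


def variance_u_statistic(sample):
    """Order-2 U-statistic with the (assumed) kernel h(x, y) = (x - y)^2 / 2."""
    pairs = list(combinations(sample, 2))
    return sum((x - y) ** 2 / 2.0 for x, y in pairs) / len(pairs)


def bagged_statistic(data, resample_size, replace, n_resamples, rng):
    """Average the statistic over resamples drawn with or without replacement."""
    total = 0.0
    for _ in range(n_resamples):
        if replace:
            resample = rng.choices(data, k=resample_size)  # with replacement
        else:
            resample = rng.sample(data, resample_size)     # without replacement
        total += variance_u_statistic(resample)
    return total / n_resamples


rng = random.Random(0)                      # fixed seed for repeatability
data = [rng.gauss(0.0, 1.0) for _ in range(40)]
n = len(data)

alpha_wo = 0.5
alpha_w = alpha_with_from_without(alpha_wo)  # 0.5 / (1 - 0.5) = 1.0
m_wo = round(alpha_wo * n)                   # resample size without replacement
m_w = round(alpha_w * n)                     # matched resample size with replacement

bag_wo = bagged_statistic(data, m_wo, False, 2000, rng)
bag_w = bagged_statistic(data, m_w, True, 2000, rng)
print(m_wo, m_w, bag_wo, bag_w)
```

With these assumed settings, half-sampling without replacement (fraction 1/2) is matched to the standard bootstrap (fraction 1), since 0.5 / (1 - 0.5) = 1. The sketch only shows the mechanics; it makes no claim about how closely the two bagged values agree in general.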
Similar resources
Investigating the Effect of Underlying Fabric on the Bagging Behaviour of Denim Fabrics (RESEARCH NOTE)
Underlying fabrics can change the appearance, function, and quality of a garment, and can also extend its longevity. With the increasing use of various types of fabrics in the garment industry, their resistance to bagging is of great importance for determining how textiles perform under various forces. The current paper investigated the effect of unde...
Bagging Down-Weights Leverage Points
Bagging is a procedure averaging estimators trained on bootstrap samples. Numerous experiments have shown that bagged estimates often yield better results than the original predictor, and several explanations have been given to account for this gain. However, six years from its introduction, bagging is still not fully understood. Most explanations given until now are based on global properties ...
Bagging Classifiers for Fighting Poisoning Attacks in Adversarial Classification Tasks
Pattern recognition systems have been widely used in adversarial classification tasks like spam filtering and intrusion detection in computer networks. In these applications a malicious adversary may successfully mislead a classifier by “poisoning” its training data with carefully designed attacks. Bagging is a well-known ensemble construction method, where each classifier in the ensemble is tr...
Cross-Validated Bagged Learning.
Many applications aim to learn a high dimensional parameter of a data generating distribution based on a sample of independent and identically distributed observations. For example, the goal might be to estimate the conditional mean of an outcome given a list of input variables. In this prediction context, bootstrap aggregating (bagging) has been introduced as a method to reduce the variance of...
Performance of Porous Pavement Containing Different Types of Pozzolans